
Assessing the Reliability of Large Language Models in the Bengali Legal Context: A Comparative Evaluation Using LLM-as-Judge and Legal Experts

Aftahee, Sabik, Farhad, A. F. M., Mallik, Arpita, Dhar, Ratnajit, Karim, Jawadul, Noor, Nahiyan Bin, Solaiman, Ishmam Ahmed

arXiv.org Artificial Intelligence

Accessing legal help in Bangladesh is difficult: people face high fees, complex legal language, a shortage of lawyers, and millions of unresolved court cases. Generative AI models such as OpenAI GPT-4.1 Mini, Gemini 2.0 Flash, Meta Llama 3 70B, and DeepSeek R1 could potentially democratize legal assistance by providing quick and affordable legal advice. In this study, we collected 250 authentic legal questions from the Facebook group "Know Your Rights," where verified legal experts regularly provide authoritative answers. These questions were then submitted to the four AI models, and responses were generated using a consistent, standardized prompt. A comprehensive dual evaluation framework was employed, in which a state-of-the-art LLM served as a judge, assessing each AI-generated response across four critical dimensions: factual accuracy, legal appropriateness, completeness, and clarity. The same set of questions was then evaluated by three licensed Bangladeshi legal professionals according to the same criteria. In addition, automated evaluation metrics, including BLEU scores, were applied to assess response similarity. Our findings reveal a complex landscape in which AI models frequently generate high-quality, well-structured legal responses but also produce dangerous misinformation, including fabricated case citations, incorrect legal procedures, and potentially harmful advice. These results underscore the critical need for rigorous expert validation and comprehensive safeguards before AI systems can be safely deployed for legal consultation in Bangladesh.
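The automated similarity check mentioned above can be sketched with a minimal single-reference BLEU implementation. This is a plain-Python stand-in (the abstract does not specify which BLEU variant or toolkit the authors used, and the example sentences are hypothetical):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Single-reference BLEU: geometric mean of modified n-gram
    precisions (n = 1..max_n) times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # clip each candidate n-gram count by its count in the reference
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        precisions.append(overlap / max(sum(cand_counts.values()), 1))
    if min(precisions) == 0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # penalize candidates shorter than the reference
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_mean)

# hypothetical model answer vs. expert answer
score = bleu("the defendant may appeal within thirty days",
             "a defendant can appeal within thirty days")
```

Note that BLEU only measures surface n-gram overlap, which is why the study pairs it with LLM-as-judge and human expert review: a response can score well on BLEU while citing a fabricated case.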


REAL: Reading Out Transformer Activations for Precise Localization in Language Model Steering

Zhan, Li-Ming, Liu, Bo, Xie, Chengqiang, Cao, Jiannong, Wu, Xiao-Ming

arXiv.org Artificial Intelligence

Inference-time steering aims to alter a large language model's (LLM's) responses without changing its parameters, but a central challenge is identifying the internal modules that most strongly govern the target behavior. Existing approaches often rely on simplistic cues or ad hoc heuristics, leading to suboptimal or unintended effects. We introduce REAL, a framework for identifying behavior-relevant modules (attention heads or layers) in Transformer models. For each module, REAL trains a vector-quantized autoencoder (VQ-AE) on its hidden activations and uses a shared, learnable codebook to partition the latent space into behavior-relevant and behavior-irrelevant subspaces. REAL quantifies a module's behavioral relevance by how well its VQ-AE encodings discriminate behavior-aligned from behavior-violating responses via a binary classification metric; this score guides both module selection and steering strength. We evaluate REAL across eight LLMs from the Llama and Qwen families and nine datasets spanning truthfulness enhancement, open-domain QA under knowledge conflicts, and general alignment tasks. REAL enables more effective inference-time interventions, achieving an average relative improvement of 20% (up to 81.5%) over the ITI method on truthfulness steering. In addition, the modules selected by REAL exhibit strong zero-shot generalization in cross-domain truthfulness-steering scenarios.
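The core idea of scoring each module by how well its activations discriminate behavior-aligned from behavior-violating responses can be sketched as follows. This is a deliberately simplified stand-in: it replaces REAL's VQ-AE codebook with a nearest-centroid probe, and the "activations" here are synthetic; the module names and separation levels are illustrative only:

```python
import numpy as np

def module_relevance(acts_pos, acts_neg):
    """Score a module by how separable behavior-aligned (pos) vs.
    behavior-violating (neg) activations are, using balanced accuracy
    of a nearest-centroid probe (REAL uses a VQ-AE with a shared
    codebook instead; this is a stand-in)."""
    mu_pos, mu_neg = acts_pos.mean(axis=0), acts_neg.mean(axis=0)

    def predict_pos(x):
        return (np.linalg.norm(x - mu_pos, axis=1)
                < np.linalg.norm(x - mu_neg, axis=1))

    acc = (predict_pos(acts_pos).mean() + (~predict_pos(acts_neg)).mean()) / 2
    return float(acc)  # higher = more behavior-relevant module

# rank hypothetical attention heads; steer the highest-scoring ones
rng = np.random.default_rng(0)
scores = {}
for head in range(4):
    sep = head  # pretend later heads encode the behavior more strongly
    pos = rng.normal(+0.5 * sep, 1.0, size=(64, 16))
    neg = rng.normal(-0.5 * sep, 1.0, size=(64, 16))
    scores[head] = module_relevance(pos, neg)
best = max(scores, key=scores.get)
```

In the paper, this relevance score guides both which modules to intervene on and how strongly to steer them; the sketch only illustrates the selection step.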


SafeLawBench: Towards Safe Alignment of Large Language Models

Cao, Chuxue, Zhu, Han, Ji, Jiaming, Sun, Qichao, Zhu, Zhenghao, Wu, Yinyu, Dai, Juntao, Yang, Yaodong, Han, Sirui, Guo, Yike

arXiv.org Artificial Intelligence

With the growing prevalence of large language models (LLMs), the safety of LLMs has raised significant concerns. However, there is still a lack of definitive standards for evaluating their safety due to the subjective nature of current safety benchmarks. To address this gap, we conducted the first exploration of LLMs' safety evaluation from a legal perspective by proposing the SafeLawBench benchmark. SafeLawBench categorizes safety risks into three levels based on legal standards, providing a systematic and comprehensive framework for evaluation. It comprises 24,860 multi-choice questions and 1,106 open-domain question-answering (QA) tasks. Our evaluation included 2 closed-source LLMs and 18 open-source LLMs using zero-shot and few-shot prompting, highlighting the safety features of each model. We also evaluated the LLMs' safety-related reasoning stability and refusal behavior. Additionally, we found that a majority voting mechanism can enhance model performance. Notably, even leading SOTA models like Claude-3.5-Sonnet and GPT-4o have not exceeded 80.5% accuracy in multi-choice tasks on SafeLawBench, while the average accuracy of 20 LLMs remains at 68.8%. We urge the community to prioritize research on the safety of LLMs.
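The majority-voting mechanism mentioned above can be sketched in a few lines: sample several answers to the same multi-choice question and keep the most common option. The sample letters below are hypothetical; the abstract does not state how many votes the authors used or how ties are broken:

```python
from collections import Counter

def majority_vote(answers):
    """Aggregate several sampled answers to one multi-choice question
    by taking the most common option (ties go to the option counted
    first, per Counter.most_common ordering)."""
    return Counter(answers).most_common(1)[0][0]

# e.g. five samples from one model on a single SafeLawBench question
samples = ["B", "A", "B", "C", "B"]
choice = majority_vote(samples)
```

This kind of self-consistency aggregation smooths out sampling noise, which is one plausible reason it lifts accuracy on a benchmark of this size.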


Sex-Fantasy Chatbots Are Leaking a Constant Stream of Explicit Messages

WIRED

Several AI chatbots designed for fantasy and sexual role-playing conversations are leaking user prompts to the web in almost real time, new research seen by WIRED shows. Some of the leaked data shows people creating conversations detailing child sexual abuse, according to the research. Conversations with generative AI chatbots are near instantaneous: you type a prompt and the AI responds. If the systems are configured improperly, however, this can lead to chats being exposed. In March, researchers at the security firm UpGuard discovered around 400 exposed AI systems while scanning the web for misconfigurations.


Comparative Analysis of Audio Feature Extraction for Real-Time Talking Portrait Synthesis

Salehi, Pegah, Sheshkal, Sajad Amouei, Thambawita, Vajira, Gautam, Sushant, Sabet, Saeed S., Johansen, Dag, Riegler, Michael A., Halvorsen, Pål

arXiv.org Artificial Intelligence

The application of AI in education has gained widespread attention for its potential to enhance learning experiences across disciplines, including psychology [1, 2]. In the context of investigative interviewing, especially when questioning suspected child victims, AI offers a promising alternative to traditional training approaches. These conventional methods, often delivered through short workshops, fail to provide the hands-on practice, feedback, and continuous engagement needed for interviewers to master best practices in questioning child victims [3, 4]. Research has shown that while best practices recommend open-ended questions and discourage leading or suggestive queries [5, 6], many interviewers still struggle to implement these techniques effectively during real-world investigations [7]. The adoption of AI-powered child avatars provides a valuable solution, enabling Child Protective Services (CPS) workers to engage in realistic practice sessions without the ethical dilemmas associated with using real children, while simultaneously offering personalized feedback on their performance [8]. Our current system leverages advanced AI techniques within a structured virtual environment to train professionals in investigative interviewing. Specifically, this system integrates the Unity Engine to generate virtual avatars. Despite the potential advantages of our AI-based training system, its effectiveness largely depends on the perceived realism and fidelity of the virtual avatars used in these simulations [9]. Based on our findings, we observed that avatars generated using Generative Adversarial Networks (GANs) demonstrated higher levels of realism compared to those created with the Unity Engine in several key aspects [10].


A Pornhub Chatbot Stopped Millions From Searching for Child Abuse Videos

WIRED

For the past two years, millions of people searching for child abuse videos on Pornhub's UK website have been interrupted. Each of the 4.4 million times someone has typed in words or phrases linked to abuse, a warning message has blocked the page, saying that kind of content is illegal. And in half the cases, a chatbot has also pointed people to where they can seek help. The warning message and chatbot were deployed by Pornhub as part of a trial program, conducted with two UK-based child protection organizations, to find out whether people could be nudged away from looking for illegal material with small interventions. A new report analyzing the test, shared exclusively with WIRED, says the pop-ups led to a decrease in the number of searches for child sexual abuse material (CSAM) and saw scores of people seek support for their behavior.


Beyond Predictive Algorithms in Child Welfare

Moon, Erina Seh-Young, Saxena, Devansh, Maharaj, Tegan, Guha, Shion

arXiv.org Artificial Intelligence

Caseworkers in the child welfare (CW) sector use predictive decision-making algorithms built on risk assessment (RA) data to guide and support CW decisions. Researchers have highlighted that RAs can contain biased signals which flatten CW case complexities and that the algorithms may benefit from incorporating contextually rich case narratives, i.e., casenotes written by caseworkers. To investigate this hypothesized improvement, we quantitatively deconstructed two commonly used RAs from a United States CW agency. We trained classifier models to compare the predictive validity of RAs with and without casenote narratives and applied computational text analysis on casenotes to highlight topics uncovered in the casenotes. Our study finds that common risk metrics used to assess families and build CW predictive risk models (PRMs) are unable to predict discharge outcomes for children who are not reunified with their birth parent(s). We also find that although casenotes cannot predict discharge outcomes, they contain contextual case signals. Given the lack of predictive validity of RA scores and casenotes, we propose moving beyond quantitative risk assessments for public sector algorithms and towards using contextual sources of information such as narratives to study public sociotechnical systems.


Researchers found child abuse material in the largest AI image generation dataset

Engadget

Researchers from the Stanford Internet Observatory say that a dataset used to train AI image generation tools contains at least 1,008 validated instances of child sexual abuse material. The Stanford researchers note that the presence of CSAM in the dataset could allow AI models that were trained on the data to generate new and even realistic instances of CSAM. LAION, the non-profit that created the dataset, told 404 Media that it "has a zero tolerance policy for illegal content and in an abundance of caution, we are temporarily taking down the LAION datasets to ensure they are safe before republishing them." The organization added that, before publishing its datasets in the first place, it created filters to detect and remove illegal content from them. However, 404 points out that LAION leaders have been aware since at least 2021 that there was a possibility of their systems picking up CSAM as they vacuumed up billions of images from the internet.